    Belief Tree Search for Active Object Recognition

    Active Object Recognition (AOR) has been approached as an unsupervised learning problem, in which optimal trajectories for object inspection are not known and are to be discovered by reducing label uncertainty measures or training with reinforcement learning. Such approaches have no guarantees of the quality of their solution. In this paper, we treat AOR as a Partially Observable Markov Decision Process (POMDP) and find near-optimal policies on training data using Belief Tree Search (BTS) on the corresponding belief Markov Decision Process (MDP). AOR then reduces to the problem of knowledge transfer from near-optimal policies on the training set to the test set. We train a Long Short Term Memory (LSTM) network to predict the best next action on the training set rollouts. We show that the proposed AOR method generalizes well to novel views of familiar objects and also to novel objects. We compare this supervised scheme against guided policy search, and find that the LSTM network reaches higher recognition accuracy than the guided policy method. We further look into optimizing the observation function to increase the total collected reward of the optimal policy. In AOR, the observation function is known only approximately. We propose a gradient-based method to update this approximate observation function so as to increase the total reward of any policy. We show that by optimizing the observation function and retraining the supervised LSTM network, the AOR performance on the test set improves significantly. Comment: IROS 201
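    The core supervised step described above, imitating near-optimal BTS rollouts with an LSTM that predicts the best next action, can be sketched as below. This is a minimal, hypothetical PyTorch sketch: the dimensions, action count, and synthetic rollouts are illustrative assumptions, not the paper's actual setup.

    ```python
    # Minimal sketch: train an LSTM to map observation histories from near-optimal
    # rollouts to the action chosen by belief tree search at each step.
    # All shapes and names below (obs_dim, n_actions, the synthetic data) are assumptions.
    import torch
    import torch.nn as nn

    class PolicyLSTM(nn.Module):
        def __init__(self, obs_dim=64, hidden_dim=128, n_actions=8):
            super().__init__()
            self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
            self.head = nn.Linear(hidden_dim, n_actions)

        def forward(self, obs_seq):
            # obs_seq: (batch, time, obs_dim) observation history of a rollout
            out, _ = self.lstm(obs_seq)
            return self.head(out)  # per-step action logits

    model = PolicyLSTM()
    optim = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    # Synthetic stand-ins for BTS demonstration rollouts: 32 rollouts, 10 steps each.
    obs = torch.randn(32, 10, 64)
    best_actions = torch.randint(0, 8, (32, 10))

    for _ in range(5):
        logits = model(obs)                                   # (32, 10, 8)
        loss = loss_fn(logits.reshape(-1, 8), best_actions.reshape(-1))
        optim.zero_grad()
        loss.backward()
        optim.step()
    ```

    At test time the same network would be unrolled step by step, feeding each new observation in and executing the predicted action, which is the knowledge-transfer scheme the abstract describes.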

    EMPATH: A Neural Network that Categorizes Facial Expressions

    There are two competing theories of facial expression recognition. Some researchers have suggested that it is an example of "categorical perception." In this view, expression categories are considered to be discrete entities with sharp boundaries, and discrimination of nearby pairs of expressive faces is enhanced near those boundaries. Other researchers, however, suggest that facial expression perception is more graded and that facial expressions are best thought of as points in a continuous, low-dimensional space, where, for instance, "surprise" expressions lie between "happiness" and "fear" expressions due to their perceptual similarity. In this article, we show that a simple yet biologically plausible neural network model, trained to classify facial expressions into six basic emotions, predicts data used to support both of these theories. Without any parameter tuning, the model matches a variety of psychological data on categorization, similarity, reaction times, discrimination, and recognition difficulty, both qualitatively and quantitatively. We thus explain many of the seemingly complex psychological phenomena related to facial expression perception as natural consequences of the tasks' implementations in the brain.
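    The abstract's key point, that one classifier can exhibit both categorical boundaries (via its discrete decision) and graded similarity (via its continuous output vector), is easy to illustrate. The following is a toy PyTorch sketch under assumed input dimensions; it is not the paper's architecture or preprocessing.

    ```python
    # Toy sketch: a small feed-forward network with six emotion outputs.
    # The argmax gives a categorical decision, while the softmax vector places each
    # face in a continuous space where inter-expression similarity can be measured.
    # Feature dimension and hidden size are illustrative assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    EMOTIONS = ["happiness", "sadness", "fear", "anger", "surprise", "disgust"]

    class ExpressionNet(nn.Module):
        def __init__(self, feat_dim=100, hidden=50):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(feat_dim, hidden), nn.Sigmoid(),
                nn.Linear(hidden, len(EMOTIONS)),
            )

        def forward(self, x):
            return self.net(x)

    model = ExpressionNet()
    face_a = torch.randn(1, 100)   # stand-ins for preprocessed face features
    face_b = torch.randn(1, 100)

    probs_a = F.softmax(model(face_a), dim=-1)   # graded response over six emotions
    probs_b = F.softmax(model(face_b), dim=-1)

    category = EMOTIONS[int(probs_a.argmax())]          # categorical read-out
    similarity = -torch.dist(probs_a, probs_b).item()   # graded similarity read-out
    print(category, similarity)
    ```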

    Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition

    Recently, there has been a lot of interest in automatically generating descriptions for an image. Most existing language-model based approaches for this task learn to generate an image description word by word in its original word order. However, for humans, it is more natural to locate the objects and their relationships first, and then elaborate on each object, describing notable attributes. We present a coarse-to-fine method that decomposes the original image description into a skeleton sentence and its attributes, and generates the skeleton sentence and attribute phrases separately. By this decomposition, our method can generate more accurate and novel descriptions than the previous state of the art. Experimental results on the MS-COCO and the larger-scale Stock3M datasets show that our algorithm yields consistent improvements across different evaluation metrics, especially on the SPICE metric, which has a much higher correlation with human ratings than the conventional metrics. Furthermore, our algorithm can generate descriptions of varied length, benefiting from the separate control of the skeleton and attributes. This enables image description generation that better accommodates user preferences. Comment: Accepted by CVPR 201
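    To make the skeleton-attribute decomposition concrete, here is a toy rule-based sketch of the idea: split a caption into a skeleton of head words plus per-word modifier phrases. The hardcoded word lists and the simple left-to-right heuristic are illustrative assumptions, not the parser used in the paper.

    ```python
    # Toy sketch of skeleton/attribute decomposition for a caption.
    # ATTRIBUTE_WORDS and FUNCTION_WORDS are hypothetical stand-ins for a real
    # syntactic parse; the paper's actual decomposition is parser-based.
    ATTRIBUTE_WORDS = {"young", "brown", "small", "wooden", "red"}
    FUNCTION_WORDS = {"a", "an", "the"}

    def decompose(caption):
        skeleton, attributes, pending = [], {}, []
        for word in caption.lower().split():
            if word in FUNCTION_WORDS:
                continue
            if word in ATTRIBUTE_WORDS:
                pending.append(word)          # modifier waits for its head word
            else:
                skeleton.append(word)         # head word joins the skeleton sentence
                if pending:
                    attributes[word] = " ".join(pending)
                    pending = []
        return " ".join(skeleton), attributes

    print(decompose("a young boy riding a brown horse"))
    # ('boy riding horse', {'boy': 'young', 'horse': 'brown'})
    ```

    In the coarse-to-fine generation direction, one decoder would produce the skeleton ("boy riding horse") and a second would fill in the attribute phrase for each skeleton word, which is what gives the separate control over length and detail mentioned above.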